Stochastic self-assembly of incommensurate clusters
We examine the classic problem of homogeneous nucleation and growth by deriving and analyzing
a fully discrete stochastic master equation. Upon comparison with results obtained from the
corresponding mean-field Becker-Döring equations we find striking differences between the two
corresponding equilibrium mean cluster concentrations. These discrepancies depend primarily on
the divisibility of the total available mass by the maximum allowed cluster size, and the
remainder. When such mass incommensurability arises, a single remainder particle can "emulsify"
or "disperse" the system by significantly broadening the mean cluster size distribution. This
finite-sized broadening effect is periodic in the total mass of the system and can arise even
when the system size is asymptotically large, provided the ratio of the total mass to the
maximum cluster size is finite. For such finite ratios we show that homogeneous nucleation in
the limit of large, closed systems is not accurately described by classical mean-field
mass-action approaches.

PACS numbers: 05.40.-a, 05.10.Gg, 64.75.Yz 



Nucleation and growth arise in countless physical and 
biological settings P. In surface and material science, 
atoms and molecules may nucleate to form islands and 
multiphase structures that strongly affect overall mate- 
rial properties 0. Nucleation and growth are also ubiq- 
uitous in cellular biology. The polymerization of actin 
filaments 0] and amyloid fibrils Q , the assembly of virus 
capsids and of antimicrobial peptides into transmem- 
brane pores 0, the recruitment of transcription factors, 
and the nucleation of clathrin-coated pits [7] are all important cell-level processes that
can be cast as problems of nucleation and growth for which there is great interest in
developing theoretical tools. Classical models of nucleation and growth include mass-action
kinetics, such as the Becker-Döring (BD) equations describing the evolution of the mean
concentrations of clusters of a given size [H, or models of independent clusters [8].
Solutions to the BD equations exhibit rich behavior, including metastable particle
distributions [9], multiple time scales [10], and nontrivial convergence to equilibrium and
coarsening [11]. Within mean-field, mass-action treatments, however, correlations,
discreteness, and stochastic effects are not included. These may be important, especially in
applications to cell biology and nanotechnology, where small system sizes or finite cluster
"stoichiometry" are involved.

In this paper, we carefully investigate the effects of discreteness and stochasticity for a
simple, mass-conserving homogeneous nucleation process. We construct the probability of the
system to be in a state with specified numbers of clusters of each size. A high-dimensional,
fully stochastic master equation governing the evolution of the state probabilities is
derived, simulated, and solved analytically in the equilibrium limit. Upon comparing the mean
cluster concentrations found from the stochastic master equation with those obtained from the
mean-field BD equations, we find qualitative differences, even in the large system size limit.
Our results highlight the importance of discreteness in nucleation and growth, and how its
inclusion leads to dramatically different results from those obtained via classical,
mean-field BD equations.

FIG. 1: (a) Homogeneous nucleation in a fixed, closed, unit volume initiated with
$n_1(t=0) = M = 30$ monomers. For small detachment rates monomers will be nearly exhausted
at long times. Here, the final cluster distribution consists of two dimers, one trimer, one
4-mer, one pentamer, and two hexamers.

We begin by considering the simple homogeneous nucleation
process in a closed system (Fig. 1). Monomers
first bind together to form dimers. Larger clusters are 
formed by successive monomer binding but can also 
shrink by monomer detachment. Within cellular bio- 
physics, nucleation and self-assembly often occur in small 
volumes. Here, monomer production/degradation may 
be slow compared to monomer attachment/detachment 
and the total number of monomers, both free and within 
clusters, can be assumed constant. Cluster sizes are also 
typically limited, either by the finite total mass of the 
system, or by some intrinsic stoichiometry. For example, virus capsids, clathrin-coated
pits, and antimicrobial peptide pores typically consist of $N \sim 100$-$1000$,
$N \sim 10$-$20$, and $N \sim 5$-$8$ molecular subunits, respectively. While various monomer
binding and unbinding rate structures [9-12], cluster fragmentation/coagulation rules, or
the presence of monomer sources [10, 14] can be included, for the sake of simplicity we
consider only monomer binding and unbinding events occurring at constant, cluster
size-independent rates.

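Consider the probability density $P(\{n\};t) \equiv P(n_1, n_2, \ldots, n_N; t)$ of our
system being in a state with $n_1$ monomers, $n_2$ dimers, $n_3$ trimers, ..., $n_N$
$N$-mers. The full stochastic master equation describing the time evolution of
$P(\{n\};t)$ is

\[
\begin{aligned}
\dot{P}(\{n\};t) = {} & -\Lambda(\{n\})\, P(\{n\};t) \\
& + \tfrac{1}{2}(n_1+2)(n_1+1)\, W_1^{+} W_1^{+} W_2^{-} P(\{n\};t) \\
& + \varepsilon (n_2+1)\, W_2^{+} W_1^{-} W_1^{-} P(\{n\};t) \\
& + \sum_{i=2}^{N-1} (n_1+1)(n_i+1)\, W_1^{+} W_i^{+} W_{i+1}^{-} P(\{n\};t) \\
& + \varepsilon \sum_{i=3}^{N} (n_i+1)\, W_1^{-} W_{i-1}^{-} W_i^{+} P(\{n\};t). \qquad (1)
\end{aligned}
\]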

Here we non-dimensionalized time so that the binding rate is unity and the detachment rate
is $\varepsilon$. Since it best illustrates the importance of discreteness in self-assembly,
we henceforth restrict ourselves to the strong binding limit $\varepsilon \ll 1$. We define
$\Lambda(\{n\}) = \tfrac{1}{2} n_1(n_1-1) + \sum_{i=2}^{N-1} n_1 n_i + \varepsilon \sum_{i=2}^{N} n_i$
as the total rate out of configuration $\{n\}$ and $W_j^{\pm}$ as the unit raising/lowering
operators that act on the number of clusters of size $j$. For example,
$W_1^{+} W_i^{+} W_{i+1}^{-} P(\{n\};t) = P(\{n'\};t)$ where
$\{n'\} = (n_1+1, \ldots, n_i+1, n_{i+1}-1, \ldots)$. We assume that all the mass is
initially in the form of monomers:
$P(\{n\};t=0) = \delta_{n_1,M}\,\delta_{n_2,0}\cdots\delta_{n_N,0}$. By construction, the
stochastic dynamics described by Eq. 1 obey the total mass conservation constraint
$M = \sum_{k=1}^{N} k\, n_k$.
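The rates entering Eq. 1 can be made concrete with a short simulation sketch. The Python below lists every transition out of a state together with its rate, then uses those rates in a standard residence-time (Gillespie-type) loop. It is a minimal illustration under stated assumptions, not the authors' code: the function names `transitions` and `kmc_final_state`, the list representation of $\{n\}$, and the stopping rule at `t_end` are all hypothetical choices made here.

```python
import random


def transitions(n, eps):
    """Return (rate, new_state) pairs leaving state n = [n_1, ..., n_N], N >= 2."""
    N = len(n)
    moves = []

    def add(rate, changes):
        # changes maps cluster size -> change in the number of such clusters
        if rate > 0:
            new = list(n)
            for size, delta in changes.items():
                new[size - 1] += delta
            moves.append((rate, new))

    add(0.5 * n[0] * (n[0] - 1), {1: -2, 2: +1})   # monomer + monomer -> dimer
    add(eps * n[1], {2: -1, 1: +2})                # dimer -> two monomers
    for i in range(2, N):                          # monomer + i-mer -> (i+1)-mer
        add(n[0] * n[i - 1], {1: -1, i: -1, i + 1: +1})
    for i in range(3, N + 1):                      # i-mer -> (i-1)-mer + monomer
        add(eps * n[i - 1], {i: -1, i - 1: +1, 1: +1})
    return moves


def kmc_final_state(M, N, eps, t_end, rng):
    """Run one trajectory starting from M monomers; return the state at t_end."""
    n = [M] + [0] * (N - 1)
    t = 0.0
    while True:
        moves = transitions(n, eps)
        total = sum(rate for rate, _ in moves)   # total exit rate Lambda({n})
        if total == 0:
            return n                             # no move possible (e.g. M = 1)
        t += rng.expovariate(total)              # exponential residence time
        if t > t_end:
            return n
        x = rng.random() * total                 # choose a move with prob. rate/total
        for rate, new in moves:
            x -= rate
            if x < 0:
                break
        n = new  # if round-off leaves x >= 0, this is the last listed move
```

Averaging the returned states over many independent calls, each with its own seeded `random.Random`, gives an estimate of the mean cluster numbers at time `t_end`. Every move changes the cluster counts so that $\sum_k k\,n_k$ stays equal to $M$.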

Solutions to Eq. 1 can be used to define quantities such as the mean numbers of clusters of
size $k$: $\langle n_k(t)\rangle \equiv \sum_{\{n\}} n_k P(\{n\};t)$. These mean numbers will
be compared to the classical BD cluster concentrations $c_k(t)$ obtained by directly
multiplying Eq. 1 by $n_k$ and summing over all allowable configurations. This procedure
leads to a hierarchy of equations relating the evolution of the mean $\langle n_k(t)\rangle$
to higher moments such as $\langle n_j(t) n_k(t)\rangle$. Closure of these equations using
the mean-field and large number approximations,
$\langle n_k n_j\rangle \approx \langle n_k\rangle\langle n_j\rangle$ and
$\langle n_1(n_1-1)\rangle \approx \langle n_1\rangle^2$, leads to the classical
Becker-Döring equations



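\[
\begin{aligned}
\dot{c}_1(t) &= -c_1^2 - c_1 \sum_{j=2}^{N-1} c_j + 2\varepsilon c_2 + \varepsilon \sum_{j=3}^{N} c_j, \\
\dot{c}_2(t) &= -c_1 c_2 + \tfrac{1}{2} c_1^2 - \varepsilon c_2 + \varepsilon c_3, \\
\dot{c}_k(t) &= -c_1 c_k + c_1 c_{k-1} - \varepsilon c_k + \varepsilon c_{k+1}, \qquad 3 \le k \le N-1, \\
\dot{c}_N(t) &= c_1 c_{N-1} - \varepsilon c_N, \qquad (2)
\end{aligned}
\]
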
where $c_k(t)$ is the mass-action approximation to $\langle n_k(t)\rangle$. Here, the
corresponding initial condition and mass conservation are expressed as
$c_k(t=0) = M\delta_{k,1}$ and $M = \sum_{k=1}^{N} k\, c_k(t)$, respectively. Eq. 2 can be
easily integrated and analyzed at equilibrium in the $\varepsilon \ll 1$ limit, where
$c_k^{\rm eq} \equiv c_k(t \to \infty)$. In equilibrium, mean-field BD theory predicts that
maximal clusters of size $N$ dominate with concentration $c_N^{\rm eq} \approx M/N$, while
$c_{k<N}^{\rm eq} \sim \varepsilon^{1-k/N} \to 0$ as $\varepsilon \to 0^{+}$. Under
mass-action, nearly all the mass is thus driven into the largest cluster. However, a simple
inconsistency emerges since the solution $c_k^{\rm eq} \approx (M/N)\delta_{k,N}$ cannot be
accurate if $M < N$, when there is insufficient mass to form a single maximal cluster.
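The mass-action picture can be checked numerically with a few lines of Python. The sketch below evaluates the right-hand side of the BD equations and finds their equilibrium by setting each attachment/detachment pair in balance, which gives $c_k = c_1^k/(2\varepsilon^{k-1})$ for $k \ge 2$. That balance relation is worked out here from the BD rates rather than quoted from the text, and the function names `bd_rhs` and `bd_equilibrium`, as well as the use of bisection on $c_1$, are illustrative assumptions.

```python
def bd_rhs(c, eps):
    """Right-hand side of the BD equations for c = [c_1, ..., c_N], N >= 3."""
    N = len(c)
    c1 = c[0]
    dc = [0.0] * N
    # monomers: lost to dimerization and attachment, gained from every detachment
    dc[0] = -c1 * c1 - c1 * sum(c[1:N - 1]) + 2 * eps * c[1] + eps * sum(c[2:])
    dc[1] = -c1 * c[1] + 0.5 * c1 * c1 - eps * c[1] + eps * c[2]
    for k in range(3, N):  # cluster sizes 3 .. N-1
        dc[k - 1] = -c1 * c[k - 1] + c1 * c[k - 2] - eps * c[k - 1] + eps * c[k]
    dc[N - 1] = c1 * c[N - 2] - eps * c[N - 1]
    return dc


def bd_equilibrium(M, N, eps):
    """Equilibrium concentrations with total mass M, found by bisection on c_1."""

    def conc(c1):
        x = c1 / eps
        # pairwise balance: c_k = c_1**k / (2 * eps**(k-1)) = (eps / 2) * x**k
        return [c1] + [0.5 * eps * x ** k for k in range(2, N + 1)]

    def mass(c1):
        return sum(k * ck for k, ck in enumerate(conc(c1), start=1))

    lo = 0.0
    # upper bracket: c_1 <= M and N * c_N <= M, so mass(hi) >= M
    hi = min(float(M), eps * (2.0 * M / (N * eps)) ** (1.0 / N))
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if mass(mid) < M:
            lo = mid
        else:
            hi = mid
    return conc(0.5 * (lo + hi))
```

For example, `bd_equilibrium(16, 8, 1e-6)` returns a profile dominated by $c_8$, and the same routine called with $M < N$ still returns a nonzero $c_N$, which is the inconsistency noted above. Because $c_{N-1}/c_N$ scales as $\varepsilon^{1/N}$, the approach to $c_N = M/N$ is slow in $\varepsilon$ for moderately large $N$.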

To further investigate this inconsistency, we simulate the fully stochastic master equation
(1) using a kinetic Monte Carlo (KMC) or residence time algorithm [15]. Figure 2 plots mean
cluster numbers $\langle n_k(t)\rangle$ and mean-field results $c_k(t)$ with $N = 8$,
$M = 16, 17$, and $\varepsilon = 10^{-6}$. Up to intermediate times
$t \lesssim \varepsilon^{-1}$, there is little difference between the results for $M = 16$
and $M = 17$, and the mass-action concentrations $c_k(t)$ roughly approximate
$\langle n_k(t)\rangle$. However, at long times $t \gg \varepsilon^{-1}$, striking
differences arise between the $M = 2N = 16$ and $M = 2N + 1 = 17$ cases. We denote our
solution in this limit as $\langle n_k^{\rm eq}\rangle$, to be compared with $c_k^{\rm eq}$.
For the commensurate case $M = 16$ (Fig. 2(a)) the mass-action solution $c_k^{\rm eq}$
roughly approximates $\langle n_k^{\rm eq}\rangle$, while for the incommensurate case
$M = 17$, $c_k^{\rm eq}$ differs dramatically from $\langle n_k^{\rm eq}\rangle$. Figure 2(c)
highlights the differences between $c_k^{\rm eq}$ and $\langle n_k^{\rm eq}\rangle$,
particularly for $k = N = 8$ (red curves). The approximation
$c_k^{\rm eq} \approx \langle n_k^{\rm eq}\rangle$ is reasonable only when $M$ is exactly
divisible by $N$, or when $M$ is very large. In the latter case, the periodically-varying
mean cluster numbers $\langle n_N^{\rm eq}\rangle \to M/N$ as $M \to \infty$, while all
other $\langle n_{k<N}^{\rm eq}\rangle$ become negligible in comparison.

[Figure 2 panels: (a) KMC vs BD, N = 8, M = 16; (b) KMC vs BD, N = 8, M = 17; (c) KMC, N = 8]

FIG. 2: Late-time mean cluster sizes $\langle n_k(t)\rangle$ obtained from averaging 10^ KMC
simulations of a stochastic nucleation process with $\varepsilon = 10^{-6}$. Only
$k = 2, 4, 6, 8$ are displayed. (a) For $N = 8$ and $M = 16$, nearly all the mass is
concentrated in $\langle n_8^{\rm eq}\rangle \approx 2$ at equilibrium. (b) For $N = 8$,
$M = 17$, a much broader equilibrium mean cluster distribution arises. For comparison, the
numerical solutions for $c_k(t)$ from the BD equations are displayed by the dashed curves.
The simulation and mean-field results agree well with each other up to intermediate times,
but a dramatic difference arises at long times as equilibrium is approached, particularly
when the total mass $M$ is indivisible by $N$. (c) The difference between $c_k^{\rm eq}$ and
$\langle n_k^{\rm eq}\rangle$ (plotted in units of $M/N$) is highlighted as a function of
$M$. The red dashed line corresponds to $c_8^{\rm eq}$ (which is nearly independent of $M$),
while the open circles correspond to $\langle n_8^{\rm eq}\rangle$ found from Monte-Carlo
simulation. Note that $c_8^{\rm eq} \approx \langle n_8^{\rm eq}\rangle$ only when $M$ is
divisible by $N = 8$, or when $M/N \to \infty$. The filled red circles correspond to
$M = 16$ and $M = 17$ as detailed in (a) and (b), respectively. A few other mean
concentrations, $\langle n_{6,7}^{\rm eq}\rangle$, along with the corresponding
$c_{6,7}^{\rm eq}$ (dashed lines) are also plotted for reference.

FIG. 3: Configurations $(n_1, n_2, n_3, n_4)$ for $N = 4$, $M = 9$. Only three distinct
states with a minimum number of clusters $\mathcal{N}_{\min} = 3$ arise. These are all
connected by monomer attachment/detachment events to states with
$\mathcal{N}_{\min} + 1 = 4$ clusters. Applying detailed balance between them leads to their
weights in the $\varepsilon \to 0^{+}$ limit.

To find analytic approximations to the equilibrium probabilities $P(\{n\};t \to \infty)$, we
make use of the fact that detachment is slow. In the $\varepsilon \ll 1$ limit, the most
highly weighted equilibrium configurations are those with the fewest total number of
clusters. For each set $\{M, N\}$, we can thus enumerate the states with the lowest number
of clusters and use detailed balance to compute their relative weights. As an explicit
example, consider the possible states for the simple case $N = 4$, $M = 9$ shown in Fig. 3.
Here, nearly all the weight settles into states with the lowest number of clusters
($\mathcal{N}_{\min} = 3$ here). Applying detailed balance between the
$\mathcal{N}_{\min} = 3$ and $\mathcal{N}_{\min} + 1 = 4$ states, neglecting corrections of
$O(\varepsilon)$, we find $\langle n_1\rangle \approx \langle n_2\rangle \approx 6/13$,
$\langle n_3\rangle \approx 9/13$, and $\langle n_4\rangle \approx 18/13$. This process can
be extended to general $M$ and $N$ and leads to simple analytic solutions. Upon defining
$M = \sigma N - j$ where $\sigma$ denotes the maximum possible number of largest clusters,
and $0 \le j \le N-1$ represents the remainder of $M/N$, we arrive at one of our main
results: exact solutions to the expected equilibrium cluster numbers in the
$\varepsilon \to 0^{+}$ limit:

\[
\langle n_{N-k}^{\rm eq}\rangle =
\frac{\sigma(\sigma-1)\, j(j-1)\cdots(j-k+1)}
     {(\sigma+j-1)(\sigma+j-2)\cdots(\sigma+j-k-1)}. \qquad (3)
\]

These expressions are valid for $0 \le j < N-1$ and all $k$. In the special case
$j = N-1$, the total mass can also be expressed as
$M = \sigma N - (N-1) = (\sigma-1)N + 1$ so that $j = N-1$ corresponds to adding a single
monomer to a system with $M = (\sigma-1)N$ monomers. In this case, combinatoric factors of 2
that arise when monomers appear in the populated configurations must be taken into account
leading to, for $j = N-1$,

\[
\begin{aligned}
\langle n_1^{\rm eq}\rangle &= \frac{2(N-1)!}{D(\sigma, N-1)}, \\
\langle n_{N-k}^{\rm eq}\rangle &=
\frac{\prod_{\ell=1}^{k}(N-\ell)\, \prod_{i=1}^{N-k-1}(\sigma-2+i)}{D(\sigma, N-1)},
\qquad 1 \le k \le N-2, \\
\langle n_N^{\rm eq}\rangle &= \frac{(\sigma-1)\, D(\sigma-1, N-1)}{D(\sigma, N-1)}, \qquad (4)
\end{aligned}
\]

where $D(\sigma, j) = j! + \prod_{i=1}^{j-1}(\sigma+i)$. Eqs. 3 and 4 have
been verified using extensive Monte-Carlo simulations.
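The enumeration argument can be reproduced exactly for small systems. The sketch below lists every state with the minimum number of clusters, weights each by $2^{n_1}/\prod_k n_k!$, and compares the resulting means with the closed forms of Eqs. 3 and 4. Two things are assumptions of this sketch rather than statements from the text: the weight, which is what detailed balance gives for the rates in Eq. 1 at fixed cluster number, and the $(\sigma-1)$ prefactor and index ranges used for Eq. 4. The function names are hypothetical, and exact rational arithmetic is used so that values such as 6/13 can be compared without round-off.

```python
from fractions import Fraction
from math import factorial, prod


def min_cluster_states(M, N):
    """Occupation vectors (n_1, ..., n_N) with mass M and the fewest clusters."""
    sigma = -(-M // N)  # ceil(M / N): minimum possible number of clusters
    states = []

    def fill(size, clusters_left, mass_left, counts):
        # counts holds n_N, n_{N-1}, ..., n_{size+1} chosen so far
        if size == 0:
            if clusters_left == 0 and mass_left == 0:
                states.append(tuple(reversed(counts)))
            return
        for count in range(min(clusters_left, mass_left // size) + 1):
            fill(size - 1, clusters_left - count,
                 mass_left - count * size, counts + [count])

    fill(N, sigma, M, [])
    return states


def enumerated_means(M, N):
    """Mean n_k over minimal-cluster states, weight 2**n_1 / prod(n_k!) (assumed)."""
    states = min_cluster_states(M, N)
    weights = [Fraction(2 ** s[0], prod(factorial(nk) for nk in s)) for s in states]
    Z = sum(weights)
    return [sum(w * s[k] for w, s in zip(weights, states)) / Z for k in range(N)]


def D(sigma, j):
    return factorial(j) + prod(sigma + i for i in range(1, j))


def closed_form_means(M, N):
    """Closed forms of Eqs. 3 and 4; this sketch requires M > N >= 2."""
    sigma = -(-M // N)
    j = sigma * N - M
    if sigma < 2 or N < 2:
        raise ValueError("sketch assumes M > N >= 2")
    means = [Fraction(0)] * N
    if j < N - 1:                                   # Eq. 3
        for k in range(j + 1):
            num = sigma * (sigma - 1) * prod(j - m for m in range(k))
            den = prod(sigma + j - m for m in range(1, k + 2))
            means[N - k - 1] = Fraction(num, den)
    else:                                           # Eq. 4, j = N - 1
        d = D(sigma, N - 1)
        means[0] = Fraction(2 * factorial(N - 1), d)
        for k in range(1, N - 1):
            num = (prod(N - l for l in range(1, k + 1))
                   * prod(sigma - 2 + i for i in range(1, N - k)))
            means[N - k - 1] = Fraction(num, d)
        means[N - 1] = Fraction((sigma - 1) * D(sigma - 1, N - 1), d)
    return means
```

For $N = 4$, $M = 9$ both routes return 6/13, 6/13, 9/13, and 18/13, matching the values quoted above. For $N = 8$ the mean number of largest clusters drops from 2 at $M = 16$ to 10/13 at $M = 17$, which gives a sense of how strongly a single extra monomer redistributes the mass.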

Fig. 4 plots $\langle n_k^{\rm eq}\rangle$ for $N = 8$ and varying $M$. Note that when
$M = 16, 24, 32$ is divisible by $N = 8$ and $j = 0$, nearly all mass is deposited into the
largest clusters, in agreement with the mass-action BD results. For cases where $M$ is not an
integer multiple of $N$ and $j > 0$, there are remaining monomers that conspire to form
smaller clusters. The number of ways this can happen may be large, generating a broad
distribution of cluster sizes. For example, let us add a single monomer to the previously
analyzed (Fig. 2(a)) state $N = 8$, $M = 16$ ($\sigma = 2$, $j = 0$). When $M = 17$
(Fig. 2(b)), Eq. 4 can be used by setting $\sigma = 3$ and $j = N - 1 = 7$. Note that by
adding just a single monomer, the mean cluster size distribution, which for $M = 16$ was
concentrated into the largest cluster, disperses and nearly uniformly populates all cluster
sizes. In our $N = 8$, $M = 17$ example, $(1, 0, 0, 0, 0, 0, 0, 2)$ is clearly one possible
state with the lowest number of clusters $\mathcal{N}_{\min} = 3$. However, as long as some
dissociation is allowed ($\varepsilon > 0$) a large number (in this case 7) of additional
nontrivial 3-cluster states are possible:
(0, 1, 0, 0, 0, 0, 1, 1), (0, 0, 0, 1, 0, 1, 1, 0), (0, 0, 1, 0, 0, 1, 0, 1),
(0, 0, 0, 0, 2, 0, 1, 0), (0, 0, 0, 1, 1, 0, 0, 1), (0, 0, 1, 0, 0, 0, 2, 0),
(0, 0, 0, 0, 1, 2, 0, 0). The equilibrium weights of these 8 states are comparable,
resulting in a very flat mean cluster size distribution compared to the $N = 8$, $M = 16$
case.

FIG. 4: The equilibrium cluster numbers $\langle n_k^{\rm eq}\rangle$ as
$\varepsilon \to 0^{+}$ plotted as functions of $1 \le k \le N = 8$ and $M$.

We can quantify this "dispersal" effect by calculating the expected cluster values
$\langle n_k^{\rm eq}\rangle$ in the incommensurate cases using Eqs. 3 and 4. As shown in
Figs. 4 and 2(c), when $M$ gets large, the dispersal effect diminishes. Recall that the BD
mass-action result $c_k^{\rm eq} \approx (M/N)\delta_{k,N}$ puts nearly all mass into
$c_N^{\rm eq}$, which is consistent with the exact solution in Eq. 4 only when
$N^2/M \ll 1$.

Thus, the mean-field result $c_k^{\rm eq} \approx (M/N)\delta_{k,N}$ is asymptotically
accurate only in the limit $M \gg N^2$, or equivalently, when $\sigma \gg N$. The
periodically-varying curve $(N/M)\langle n_N^{\rm eq}\rangle$ in Fig. 2(c) therefore
asymptotes to the mass-action result as $M/N^2 \to \infty$.

Finally, Table I lists regimes of validity and results for three different models:
mass-action Becker-Döring equations without an imposed maximum cluster size, Becker-Döring
equations with a fixed finite maximum cluster size $N$, and the fully stochastic master
equation. Three different ways of taking the large system limits $M, N \to \infty$ are
considered. The first column in Table I with $N = \infty$ and $M$ finite corresponds to
nucleation with unbounded cluster sizes. All models yield a single cluster of size $M$, but
display different scaling behavior in $\varepsilon$ (not discussed here). In the other
extreme where $M/N \gg N$, equilibrium results from the finite-$N$ BD equations match those
of the discrete stochastic model and all the mass is concentrated into clusters of maximal
size. However, just as before, the results from the mass-action and stochastic treatments
approach their common distribution very differently in $\varepsilon$. The essential result
described in our work applies in the intermediate case where $M/N$ is finite, as summarized
in the middle column of Table I. Here we find the novel incommensurability effect
highlighted in Figs. 2(c) and 4. These effects persist even in the $M, N \to \infty$ limits,
as long as their ratio is kept fixed. Our findings indicate that for many applications,
where the effective $M/N$ is finite, mean-field models of nucleation and growth fail and
discrete stochastic treatments are required.

TABLE I: Accuracy and validity regimes for equilibrium cluster numbers of different
nucleation models for $\varepsilon \ll 1$. Results indicated by * or by † match in the
$\varepsilon \to 0^{+}$ limit, but approach their common result very differently in
$\varepsilon$.

This work was supported by the NSF through grants DMS-1032131 (TC), DMS-1021818 (TC),
DMS-0719462 (MD), and DMS-1021850 (MD). TC is also supported by the Army Research Office
through grant 58386MA.